\section{Conclusion}

Relation clustering is crucial to many natural language processing
applications, including relation extraction, question answering,
summarization, and machine translation. Unfortunately, previous
approaches based on distributional similarity and parallel corpora
often produce low-precision clusters. This paper introduces three
Temporal Correspondence Heuristics that characterize semantically
equivalent phrases in news streams. We present a novel algorithm,
\sys, based on a probabilistic graphical model encoding these
heuristics, which harvests high-quality relation clusters.


% The joint model of \sys\ is also flexible in its ability to include
% new features and constraints.

Experiments show that \sys\ improves over several other methods,
especially at producing lexically diverse clusters. Ablation tests
confirm that our temporal features are crucial to \sys's precision. To
spur future research, we are releasing an annotated corpus of
time-stamped news articles and our harvested relation clusters.

%%  Additionally, this work introduces a new evaluation
%% metric which considers only diversified relation clusters which are
%% more useful but are challenging to generate. Empirical study shows
%% that \sys\ outperforms other methods in precision by considerable
%% margins, especially on the diversified relation clusters. This
%% demonstrates the power of the proposed temporal heuristics and the
%% effectiveness of our joint model. 




%For example, textual entailment~\cite{dagan2009recognizing}, which finds a phrase inferring another phrase, is very related to relation clustering task. It is natural to conjecture that certain temporal hypotheses are useful to textual entailment; relation clusters are used in different tasks in various ways, it is interesting to develop a general interface to improve the performances of different downstream NLP tasks.






